NLP

Jointly Optimizing Diversity and Relevance in Neural Response Generation

This paper proposes SpaceFusion, a regularized multi-task learning framework that jointly optimizes diversity and relevance through a structured latent space. NAACL 2019.

paper link
code link

Introduction

This paper studies dialogue response generation. Conventional seq2seq models tend to produce bland, generic replies. Prior work on improving the diversity and relevance of generated responses falls roughly into two categories:

  • Decoding/ranking: optimize only at decoding time, re-ranking the beam-search results with context-related information. Drawback: requires a very large beam size. _A diversity-promoting objective function for neural conversation models_
  • Training/latent space: use a CVAE to model discourse-level diversity. Drawback: response relevance is sacrificed (unless additional dialogue-act labels are available). _Learning discourse-level diversity for neural dialog models using conditional variational autoencoders_

The idea of this paper is to jointly optimize diversity and relevance during training by aligning the latent spaces of the following two models (a minimal architecture sketch follows the list):

  • Sequence-to-Sequence (S2S): latent vector of the context
  • Autoencoder (AE): latent vectors of multiple possible diverse responses
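Below is a minimal PyTorch-style sketch of this two-encoder, shared-decoder setup. The module names, sizes, and single-layer GRUs are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SpaceFusionSketch(nn.Module):
    """Two encoders (S2S and AE) feeding one shared GRU decoder."""
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # S2S path: encodes the context x into z_S2S(x)
        self.ctx_encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # AE path: encodes a response y into z_AE(y)
        self.resp_encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        # Shared decoder: its initial hidden state is a latent vector z
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def encode_context(self, x):          # z_S2S(x), shape (batch, hidden_dim)
        _, h = self.ctx_encoder(self.embed(x))
        return h.squeeze(0)

    def encode_response(self, y):         # z_AE(y), shape (batch, hidden_dim)
        _, h = self.resp_encoder(self.embed(y))
        return h.squeeze(0)

    def decode_logits(self, z, y_in):
        # Teacher-forced decoding from latent z, used by both the S2S and AE paths.
        out, _ = self.decoder(self.embed(y_in), z.unsqueeze(0).contiguous())
        return self.out(out)
```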

A simple approach would be vanilla multi-task learning:
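A plausible form of this vanilla multi-task objective, reconstructed from the surrounding description (assuming both models share one decoder $p_\theta$), is

$$
\mathcal{L}_{\mathrm{MTL}} = -\frac{1}{|y_i|} \log p_\theta\left(y_i \mid z_{\mathrm{S2S}}(x_i)\right) - \frac{1}{|y_i|} \log p_\theta\left(y_i \mid z_{\mathrm{AE}}(y_i)\right)
$$

i.e. the sum of the S2S loss and the AE reconstruction loss.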

However, the drawback of this approach is that the two latent spaces are hard to align:

Therefore, this paper proposes a geometric approach, SPACEFUSION, which yields a structured latent space in which the distance and direction of a predicted response represent its relevance and diversity, respectively, as shown in the figure below:

The SPACEFUSION Model

Given a dataset $\mathcal{D}=\left[\left(x_{0}, y_{0}\right),\left(x_{1}, y_{1}\right), \cdots,\left(x_{n}, y_{n}\right)\right]$, where $x_{i}$ and $y_{i}$ denote the context and the response respectively, the goal of the model is to generate relevant and diverse responses.

The core of SPACEFUSION is two regularizing loss terms; a sketch of their formulations follows the list below:

  • $\mathcal{L}_{\mathrm{fuse}}$: pull corresponding S2S and AE points closer to each other; in the experiments $d$ is the Euclidean distance.
  • $\mathcal{L}_{\mathrm{interp}}$: encourage a smooth semantic transition between the S2S and AE latent vectors.

  • Finally, combine them with the vanilla multi-task loss:
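Sketched from the descriptions above rather than copied from the paper, the two regularization terms and the combined objective can be written as follows, where $d$ is the Euclidean distance, $u \sim U(0, 1)$, and $\alpha$, $\beta$ are assumed weighting hyperparameters:

$$
\mathcal{L}_{\mathrm{fuse}} = \frac{1}{n} \sum_{i} d\left(z_{\mathrm{S2S}}(x_i),\ z_{\mathrm{AE}}(y_i)\right)
$$

$$
z_{\mathrm{interp}}(x_i, y_i) = u\, z_{\mathrm{S2S}}(x_i) + (1-u)\, z_{\mathrm{AE}}(y_i), \qquad
\mathcal{L}_{\mathrm{interp}} = -\frac{1}{|y_i|} \log p_\theta\left(y_i \mid z_{\mathrm{interp}}(x_i, y_i)\right)
$$

$$
\mathcal{L} = \mathcal{L}_{\mathrm{MTL}} + \alpha\, \mathcal{L}_{\mathrm{interp}} + \beta\, \mathcal{L}_{\mathrm{fuse}}
$$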

Inference: at prediction time, sample $r$ at a radius $|r|$ (a hyperparameter), take $z(x, r)$ as the initial state of the decoder GRU, and use greedy decoding:
$$
z(x, r) = z_{\mathrm{S2S}}(x) + r
$$
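A minimal sketch of this inference step, assuming the `SpaceFusionSketch` model above (the radius, length limit, and token ids are illustrative):

```python
import torch

def sample_response(model, x, radius=1.0, max_len=30, bos_id=1, eos_id=2):
    """Greedy decoding from z(x, r) = z_S2S(x) + r, with r drawn at a fixed radius."""
    model.eval()
    with torch.no_grad():
        z_s2s = model.encode_context(x)                    # (1, hidden_dim)
        # Sample a random direction and rescale it to norm |r| = radius.
        r = torch.randn_like(z_s2s)
        r = radius * r / r.norm(dim=-1, keepdim=True)
        h = (z_s2s + r).unsqueeze(0).contiguous()          # initial decoder GRU state
        token = torch.tensor([[bos_id]])
        tokens = []
        for _ in range(max_len):                           # greedy decoding
            out, h = model.decoder(model.embed(token), h)
            token = model.out(out[:, -1]).argmax(dim=-1, keepdim=True)
            if token.item() == eos_id:
                break
            tokens.append(token.item())
    return tokens
```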

Structured latent space

The regularization terms induce the desired structure in the latent space, mapping semantics to geometry:

  • Diversity -> direction: $L_{interp}$ regularizes the semantics along the line between the context and response vectors, so different responses spread out in different directions
  • Relevance -> distance: $L_{fuse}$ regularizes the distance, so relevant responses stay close to the context vector
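As a toy illustration of reading these two properties off the geometry (latent vectors assumed precomputed; the scoring functions are hypothetical, not from the paper):

```python
import numpy as np

def relevance_score(z_ctx, z_resp):
    # Smaller distance from the context vector -> more relevant response.
    return -np.linalg.norm(z_resp - z_ctx)

def diversity_score(z_ctx, z_resp_a, z_resp_b):
    # Larger angle between two response directions -> a more diverse pair.
    a, b = z_resp_a - z_ctx, z_resp_b - z_ctx
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return 1.0 - cos
```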

Direction & diversity

SpaceFusion tends to map different possible responses to different directions in the latent space.

Interpolation & smoothness

Experiments

Conclusion

This paper proposes SpaceFusion, a regularized multi-task learning framework that jointly optimizes diversity and relevance through a structured latent space.

Reference